A Data Mining Approach for Detecting Collusion in Unproctored Online Exams
J. Langerbein(1), T. Massing(1), J. Klenke(1), N. Reckmann(1), M. Striewe(1),
M. Goedicke(1), C. Hanck(1)
(1) University of Duisburg-Essen, Germany
Setting
- Data from the Descriptive Statistics course at the University of Duisburg-Essen, Germany
- Exams consist of arithmetical problems, programming tasks in R, and a short essay task
- Both exams are conducted digitally with the e-assessment system JACK
- Each student receives different randomized numerical values across all tasks
- Event logs capture students’ activities, time stamps, and points during the exams for every subtask
- The test group took the unproctored exam at home during the COVID-19 pandemic
- The comparison group took a proctored exam in the facilities of the university
| | Comparison | Test |
|---|---|---|
| Year | 18/19 | 20/21 |
| N | 109 | 151 |
| Style | proctored | unproctored |
| Total points | 60 | 60 |
| Subtasks | 19 | 17 |
| Duration | 70 | 70 |
- Data cleaning is conducted, removing students with minimal participation or achievement and students with internet problems
Methodology
- The study uses an agglomerative (bottom-up) hierarchical clustering algorithm based on the following global pairwise dissimilarity measure:
\[D(s_i, s_{i'}, v_i, v_{i'}) = \frac{1}{h} \sum_{j=1}^h (w_j^P \cdot d_j^P (s_{ij}, s_{i'j}) + w_j^L \cdot d_j^L (v_{ij}, v_{i'j}))\]
- \(D(s_i, s_{i'}, v_i, v_{i'})\) the global pairwise dissimilarity
- \(d_j^P(s_{ij}, s_{i'j})\) points dissimilarity for each task \(j\)
- \(d_j^L(v_{ij}, v_{i'j})\) students' event pattern dissimilarity for each task \(j\)
- \(w_j^P\) and \(w_j^L\), with \(\displaystyle \sum_{j=1}^h (w_j^P + w_j^L) = 1\), are weights that control the influence of each attribute on the global object dissimilarity
- We reduce the weights for R tasks, as these tasks have more noise, and for essay questions, as comparisons on that kind of task are limited
- Points achieved
- Dissimilarities in the students' event patterns (time of submission) for each task \(j\)
\[d_j^L(v_{ij}, v_{i'j}) = \sum_{m=1}^{K=70} | v_{ijm} - v_{i'jm} |\]
- \(d_j^L(v_{ij}, v_{i'j})\) students' event pattern dissimilarity for each task \(j\)
- The examination is divided into \(K = 70\) time intervals, indexed by \(m = 1, \dots, 70\)
- \(v_{ijm}\) denotes the number of answers of student \(i\) for task \(j\) in the \(m\)-th interval
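The two formulas above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the intervals are assumed to be one minute long (70 intervals over the exam), and the points dissimilarity \(d_j^P\), which is not spelled out here, is passed in as a precomputed value.

```python
K = 70  # number of time intervals, as in the text


def event_counts(submit_minutes, k=K):
    """Build v_ij: number of submissions per interval for one student and task.

    submit_minutes holds submission times in minutes since exam start
    (assumption: each interval spans one minute).
    """
    counts = [0] * k
    for t in submit_minutes:
        m = min(int(t), k - 1)  # a submission at the very end falls into the last interval
        counts[m] += 1
    return counts


def event_dissimilarity(v_a, v_b):
    """d_j^L: sum over intervals of |v_ijm - v_i'jm|."""
    return sum(abs(a - b) for a, b in zip(v_a, v_b))


def global_dissimilarity(d_points, d_events, w_points, w_events):
    """D: (1/h) * sum over tasks j of (w_j^P * d_j^P + w_j^L * d_j^L)."""
    h = len(d_points)
    total = sum(
        wp * dp + wl * dl
        for dp, dl, wp, wl in zip(d_points, d_events, w_points, w_events)
    )
    return total / h


# Hypothetical submission times (minutes) of two students on one task.
v_a = event_counts([3.2, 3.9, 10.5])
v_b = event_counts([3.5, 10.1, 10.8])
d_l = event_dissimilarity(v_a, v_b)
print(d_l)  # 2

# Two tasks with made-up per-task dissimilarities and equal weights summing to 1.
print(global_dissimilarity([0, 1], [d_l, 0], [0.25, 0.25], [0.25, 0.25]))  # 0.375
```

Lowering `w_points[j]` and `w_events[j]` for a task reduces that task's contribution to the global dissimilarity, which is how the down-weighting of R tasks and essay questions would enter this sketch.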
Empirical Results
- Figure 1 shows the dendrogram of the test group
- Overall, the level of dissimilarity is lower than in the comparison group, suggesting possible collusion
- Six clusters (A-F) stand out noticeably from the rest of the cohort
- Figure 2 illustrates the individual comparison of achieved points and event logs of the student cluster with the highest similarity.
- Figure 3 compares the normalized distributions of the dissimilarity measures between the comparison and test groups.
- Three data points from the test group are markedly distinct from the rest of the data points.
Discussion
- The study discusses the results of hierarchical clustering algorithms, visually represented via a dendrogram, a tree-like structure.
- Various clustering algorithms were compared, and average linkage clustering was found to be the most suitable for the analysis.
- The use of average linkage clustering helped identify compact clusters (specifically clusters A, B, and E), suggesting a lack of large-group collusion.
- Additional visual tools like scatterplots and bar charts were employed to examine similarities among students within these clusters.
- The study used a reference group for comparison, validating the method’s effectiveness in detecting collusion, though limitations exist due to unknown ground truth.
- The approach not only aids in deterring cheating in unproctored exams but also contributes to the broader digital transformation of education, equipping us to handle unforeseen future challenges similar to the COVID-19 pandemic.
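Average linkage, the criterion named in the discussion, merges at each step the two clusters whose members have the smallest mean pairwise dissimilarity. The following is a minimal pure-Python sketch of that idea, not the study's code; the function name and the 5-student dissimilarity matrix are hypothetical and do not come from the exam data.

```python
def average_linkage(dist):
    """Naive agglomerative clustering with average linkage.

    dist is a symmetric matrix (list of lists) of pairwise dissimilarities.
    Returns the merge history as (cluster_a, cluster_b, height) tuples,
    where each cluster is a tuple of student indices.
    """
    clusters = [(i,) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Mean dissimilarity over all cross-cluster student pairs.
                pairs = [dist[i][j] for i in clusters[a] for j in clusters[b]]
                height = sum(pairs) / len(pairs)
                if best is None or height < best[0]:
                    best = (height, a, b)
        height, a, b = best
        merges.append((clusters[a], clusters[b], height))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges


# Hypothetical global dissimilarities: students 0 and 1 are unusually similar.
D = [
    [0.0, 0.1, 0.9, 0.8, 0.9],
    [0.1, 0.0, 0.8, 0.9, 0.8],
    [0.9, 0.8, 0.0, 0.7, 0.9],
    [0.8, 0.9, 0.7, 0.0, 0.8],
    [0.9, 0.8, 0.9, 0.8, 0.0],
]

merges = average_linkage(D)
print(merges[0])  # ((0,), (1,), 0.1)
```

A pair that merges at a height far below all other merges, like students 0 and 1 here, corresponds to the kind of compact cluster that stands out in a dendrogram. In practice a library routine would normally replace this loop, for example SciPy's `scipy.cluster.hierarchy.linkage` with `method="average"`.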
Further Research
- Future research could assess the long-term efficacy of the collusion detection method during exams and its impact on academic integrity and student behavior.
- Additional studies might focus on refining methods for gathering and analyzing supplementary evidence, with the ultimate goal of improving collusion detection rates. These efforts aim to provide a better understanding of the prevalence and extent of student collusion.